pglog

un outil d'analyse de logs

PgSessions - Paris
2021-11-17

Qui suis-je?

Image by Anemone123 from Pixabay

Pourquoi réinventer la roue?

  • Outils existants pas assez flexibles
    • Pas d'analyse sur une période spécifique
    • Pas de statistiques précises
Image by Andy Giraud from Pixabay

Pourquoi réinventer la roue?

  • Exemples:
    • Analyse des pics de production
    • Analyse en profondeur des fichiers temporaires
Image by Andy Giraud from Pixabay

Comment l'installer?

  • Ne pas utiliser en production
  • Simple extension Tout est expliqué dans les fichier README.md
  • Dockerfile disponible
Image by Harry Strauss from Pixabay

Utilisons docker

docker build -t pglog_postgresql .
docker run -d --rm -P --name  pglog pglog_postgresql
Image by chiara tiberti from Pixabay

Utilisons docker

Comment l'utiliser?

  • Ne pas utiliser en production
  • Sur des logs pour Postgres > 9.0
  • Sur des logs au format csv
  • Sur des logs avec des informations
Image by Markus Winkler from Pixabay

La table pglog

  • \d pglog
  • table partitionnnée
  • FDW ou table physique
Image by Jeong Eun Lee from Pixabay

Rapport d'erreurs

select * from error_report;
 error_severity |   nb    | percentage
----------------+---------+------------
 FATAL          |       8 |       0.00
 ERROR          |     106 |       0.00
 WARNING        |       3 |       0.00
 LOG            | 6055776 |     100.00
(4 rows)
Image by Ryan McGuire from Pixabay

Rapport détaillé sur un niveau d'erreur

select * from error_analyze_fatal;
                       message                       | count |            min             |            max
-----------------------------------------------------+-------+----------------------------+----------------------------
 terminating connection due to administrator command |     7 | 2021-08-25 13:01:32.992+00 | 2021-08-27 09:12:44.884+00
 connection to client lost                           |     2 | 2021-08-25 12:20:24.044+00 | 2021-08-25 12:20:24.044+00
(2 rows) 
Image by Ryan McGuire from Pixabay

Rapport sur l'autovacuum

select * from autovacuum_report
order by vacuum_count desc fetch first 5 rows only;
 schema_name |  table_name   | vacuum_count | analyze_count
-------------+---------------+--------------+---------------
 pg_catalog  | pg_statistic  |           24 |             0
 pg_catalog  | pg_attribute  |           15 |            14
 pglog       | postgresqlwed |            8 |             7
 pg_catalog  | pg_class      |            7 |             7
 pg_catalog  | pg_type       |            6 |             6
(5 rows) 
Image by Ryan McGuire from Pixabay

Statistiques de checkpoint

select * from checkpoints_stats;
§
 avg_checkpoint_freq | median_checkpoint_freq | avg_checkpoint_duration | median_checkpoint_duration
---------------------+------------------------+-------------------------+----------------------------
 06:01:55.491981     | 00:15:00.809           | 00:00:32.447288         | 00:00:06.6675
(1 row) 
Image by Ryan McGuire from Pixabay

Plus de précisions

pglog=# select * from checkpoints_stats_by_hour;

 date_part | avg_checkpoint_freq | median_checkpoint_freq | avg_checkpoint_duration | median_checkpoint_duration
-----------+---------------------+------------------------+-------------------------+----------------------------
         6 | 03:31:43.989333     | 00:04:59.941           | 00:00:06.375            | 00:00:04.431
         7 | 22:36:05.281917     | 00:15:00.809           | 00:00:17.00225          | 00:00:03.916
         8 | 00:21:22.5615       | 00:15:01.661           | 00:00:17.9445           | 00:00:09.3765
         9 | 00:39:21.674167     | 00:30:17.2415          | 00:00:33.430833         | 00:00:11.2735
        10 | 00:16:09.9175       | 00:16:09.9175          | 00:00:06.032            | 00:00:06.032
        11 | 00:37:25.941667     | 00:12:22.328           | 00:01:40.044667         | 00:02:29.37
        12 | 01:11:27.7846       | 00:40:00.957           | 00:01:05.7842           | 00:00:16.016
        13 | 00:15:19.55725      | 00:07:56.6295          | 00:01:18.24225          | 00:01:21.253
        15 | 04:13:58.037        | 04:13:58.037           | 00:02:30.109            | 00:02:30.109
        16 | 02:58:23.038333     | 00:25:00.427           | 00:00:05.165667         | 00:00:05.012
        17 | 00:15:00.34025      | 00:07:30.346           | 00:00:04.27425          | 00:00:03.195
        19 | 01:02:31.026        | 01:02:31.026           | 00:00:06.4115           | 00:00:06.4115
        20 | 00:09:59.441        | 00:09:59.441           | 00:00:06.241            | 00:00:06.241 
Image by Ryan McGuire from Pixabay

Fichiers temporaires

pglog=# select * from tempfile_report\gx

-[ RECORD 2 ]--+----------------------------------------------------------------------
min            | 8192 kB
max            | 374 GB
global_average | 57 GB
10%            | 5720 MB
20%            | 5720 MB
30%            | 5720 MB
40%            | 6000 MB
50%            | 32 GB
60%            | 33 GB
70%            | 36 GB
80%            | 142 GB
90%            | 144 GB
100%           | 374 GB 
Image by Ryan McGuire from Pixabay

Requêtes utilisant des Fichiers temporaires

select * from tempfile_queries order by frequency
 fetch first 2 rows only\gx

-[ RECORD 1 ]--------------+----------------------------------------------------------------------
frequency                  | 154
query_tempfilesize_median  | 6000 MB
query_tempfilesize_average | 30 GB
total_size                 | 4655 GB
query                      | call create_mylogs (?,true); 
Image by Ryan McGuire from Pixabay

Requêtes lentes

pglog=# select *
from top_n_slowest_queries
order by "average" desc
fetch first 1 rows only;

-[ RECORD 1 ]----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
query   | with queries as ( select split_part(message, ? ) ) select query, count ? , avg(duration) as average, percentile_disc ? within group (order by duration) as median from queries group by query order by average desc limit ?
count   | 1
average | 26384.688000000000
median  | 26384.688
Image by Ryan McGuire from Pixabay

Statistiques des sessions

select * from sessions_stats ;

 min_duration | max_duration  |  avg_duration   | user_name | database_name 
--------------+---------------+-----------------+-----------+---------------
 00:00:00.033 | 00:41:05.38   | 00:05:32.636933 | postgres  | mybench
 00:02:41.311 | 156:33:45.673 | 43:09:49.903    | postgres  | pglog
 00:00:03.538 | 00:00:06.231  | 00:00:04.8845   | postgres  | postgres
 00:00:00.003 | 00:00:00.013  | 00:00:00.007333 | postgres  | template1
Image by Ryan McGuire from Pixabay

Rapport de connexions

select * from connections_report ;

 user_name | database_name |        frequency        | connection_number 
-----------+---------------+-------------------------+-------------------
 postgres  | mybench       | 00:23:57.488429         |                14
 postgres  | pglog         | 10 days 12:05:08.038667 |                 3
 postgres  | postgres      | 00:04:20.521            |                 1
 postgres  | template1     | 00:00:00.519            |                 2
Image by Ryan McGuire from Pixabay

Ce qui est vraiment important

  • Votre système est spécifique
  • Avec ses problèmes spécifiques
  • Faites vos propres requêtes!
https://gitlab.com/l_avrot/pglog
Image by Ryan McGuire from Pixabay

Des questions?

Image by Tumisu from Pixabay